
    Small steps and giant leaps: Minimal Newton solvers for Deep Learning

    We propose a fast second-order method that can be used as a drop-in replacement for current deep learning solvers. Compared to stochastic gradient descent (SGD), it only requires two additional forward-mode automatic differentiation operations per iteration, which has a computational cost comparable to two standard forward passes and is easy to implement. Our method addresses long-standing issues with current second-order solvers, which invert an approximate Hessian matrix every iteration exactly or by conjugate-gradient methods, a procedure that is both costly and sensitive to noise. Instead, we propose to keep a single estimate of the gradient projected by the inverse Hessian matrix, and update it once per iteration. This estimate has the same size and is similar to the momentum variable that is commonly used in SGD. No estimate of the Hessian is maintained. We first validate our method, called CurveBall, on small problems with known closed-form solutions (noisy Rosenbrock function and degenerate 2-layer linear networks), where current deep learning solvers seem to struggle. We then train several large models on CIFAR and ImageNet, including ResNet and VGG-f networks, where we demonstrate faster convergence with no hyperparameter tuning. Code is available.
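The update described above can be sketched in a few lines. This is a minimal, hypothetical illustration on the (noise-free) Rosenbrock test function: the values of ρ and β are illustrative, and the Hessian-vector product is approximated by finite differences rather than the forward-mode automatic differentiation the paper uses.

```python
import numpy as np

def rosenbrock(w):
    x, y = w
    return (1 - x) ** 2 + 100 * (y - x ** 2) ** 2

def rosenbrock_grad(w):
    x, y = w
    return np.array([-2 * (1 - x) - 400 * x * (y - x ** 2),
                     200 * (y - x ** 2)])

def hvp(w, z, eps=1e-6):
    # Hessian-vector product H(w) z, approximated here by finite differences
    # (the paper computes it with forward-mode automatic differentiation).
    return (rosenbrock_grad(w + eps * z) - rosenbrock_grad(w)) / eps

w = np.array([-1.0, 1.0])   # start away from the optimum at (1, 1)
z = np.zeros_like(w)        # single estimate of the Hessian-projected gradient
rho, beta = 0.9, 1e-3       # illustrative hyperparameters

for _ in range(5000):
    g = rosenbrock_grad(w)
    z = rho * z - beta * (hvp(w, z) + g)  # update the estimate once per step
    w = w + z                             # apply it like a momentum variable
```

Note that, as in the abstract, no Hessian matrix is ever stored: only the vector z, which has the same size as the parameters.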

    Semi-Supervised Learning with Scarce Annotations

    While semi-supervised learning (SSL) algorithms provide an efficient way to make use of both labelled and unlabelled data, they generally struggle when the number of annotated samples is very small. In this work, we consider the problem of SSL multi-class classification with very few labelled instances. We introduce two key ideas. The first is a simple but effective one: we leverage the power of transfer learning among different tasks and self-supervision to initialize a good representation of the data without making use of any label. The second idea is a new algorithm for SSL that can exploit such a pre-trained representation well. The algorithm works by alternating two phases, one fitting the labelled points and one fitting the unlabelled ones, with carefully-controlled information flow between them. The benefits are a substantial reduction in overfitting of the labelled data and the avoidance of issues with balancing labelled and unlabelled losses during training. We show empirically that this method can successfully train competitive models with as few as 10 labelled data points per class. More generally, we show that the idea of bootstrapping features using self-supervised learning always improves SSL on standard benchmarks. We show that our algorithm works increasingly well compared to other methods when refining from other tasks or datasets. Comment: Workshop on Deep Vision, CVPR 202
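The alternating two-phase idea can be illustrated with a toy sketch. This is not the paper's algorithm: it is a hypothetical self-training loop with a plain logistic-regression classifier on synthetic 2-D data, alternating a phase that fits the few labelled points and a phase that fits the unlabelled points to their current pseudo-labels.

```python
import numpy as np

rng = np.random.default_rng(0)
# toy two-class data: well-separated Gaussians
n = 200
X = np.vstack([rng.normal([-2, 0], 1.0, size=(n, 2)),
               rng.normal([+2, 0], 1.0, size=(n, 2))])
y = np.array([0] * n + [1] * n)
# only 5 labels per class; the rest is treated as unlabelled
lab = np.concatenate([np.arange(5), n + np.arange(5)])
unl = np.setdiff1d(np.arange(2 * n), lab)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

F = np.hstack([X, np.ones((len(X), 1))])  # features + bias column
w = np.zeros(3)

for epoch in range(50):
    # phase 1: fit the labelled points
    for _ in range(10):
        p = sigmoid(F[lab] @ w)
        w -= 0.1 * F[lab].T @ (p - y[lab]) / len(lab)
    # phase 2: fit the unlabelled points to their current pseudo-labels
    pseudo = (sigmoid(F[unl] @ w) > 0.5).astype(float)
    for _ in range(10):
        p = sigmoid(F[unl] @ w)
        w -= 0.1 * F[unl].T @ (p - pseudo) / len(unl)

acc = np.mean((sigmoid(F @ w) > 0.5) == y)
```

The paper's "carefully-controlled information flow" between the two phases is far more involved than this sketch; the point is only the alternation of a labelled-fitting and an unlabelled-fitting phase.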

    Automatically Discovering and Learning New Visual Categories with Ranking Statistics

    We tackle the problem of discovering novel classes in an image collection given labelled examples of other classes. This setting is similar to semi-supervised learning, but significantly harder because there are no labelled examples for the new classes. The challenge, then, is to leverage the information contained in the labelled images in order to learn a general-purpose clustering model and use the latter to identify the new classes in the unlabelled data. In this work we address this problem by combining three ideas: (1) we suggest that the common approach of bootstrapping an image representation using the labelled data only introduces an unwanted bias, and that this can be avoided by using self-supervised learning to train the representation from scratch on the union of labelled and unlabelled data; (2) we use rank statistics to transfer the model's knowledge of the labelled classes to the problem of clustering the unlabelled images; and, (3) we train the data representation by optimizing a joint objective function on the labelled and unlabelled subsets of the data, improving both the supervised classification of the labelled data, and the clustering of the unlabelled data. We evaluate our approach on standard classification benchmarks and outperform current methods for novel category discovery by a significant margin. Comment: ICLR 2020, code: http://www.robots.ox.ac.uk/~vgg/research/auto_nove
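The rank-statistics idea in point (2) can be illustrated with a small sketch. The exact criterion in the paper may differ; this hypothetical version declares two images the same class when the indices of their top-k feature activations agree as sets, which is robust to the exact activation magnitudes:

```python
import numpy as np

def same_class(fa, fb, k=3):
    """Pairwise pseudo-label from rank statistics: two feature vectors are
    deemed to depict the same class when their top-k activation indices
    coincide as sets, regardless of the activation magnitudes."""
    return set(np.argsort(-fa)[:k]) == set(np.argsort(-fb)[:k])

# two feature vectors with matching top-3 dimensions but different magnitudes
a = np.array([0.9, 0.1, 0.8, 0.7, 0.0])
b = np.array([0.5, 0.2, 0.9, 0.6, 0.1])
c = np.array([0.1, 0.9, 0.2, 0.1, 0.8])
print(same_class(a, b))  # True: both top-3 sets are {0, 2, 3}
print(same_class(a, c))  # False: c's top-3 set is {1, 2, 4}
```

Such pairwise pseudo-labels can then drive the clustering of the unlabelled images.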

    RELATE: Physically Plausible Multi-Object Scene Synthesis Using Structured Latent Spaces

    We present RELATE, a model that learns to generate physically plausible scenes and videos of multiple interacting objects. Similar to other generative approaches, RELATE is trained end-to-end on raw, unlabeled data. RELATE combines an object-centric GAN formulation with a model that explicitly accounts for correlations between individual objects. This allows the model to generate realistic scenes and videos from a physically-interpretable parameterization. Furthermore, we show that modeling the object correlation is necessary to learn to disentangle object positions and identity. We find that RELATE is also amenable to physically realistic scene editing and that it significantly outperforms prior art in object-centric scene generation in both synthetic (CLEVR, ShapeStacks) and real-world data (cars). In addition, in contrast to state-of-the-art methods in object-centric generative modeling, RELATE also extends naturally to dynamic scenes and generates videos of high visual fidelity. Source code, datasets and more results are available at http://geometry.cs.ucl.ac.uk/projects/2020/relate/
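Why modeling inter-object correlations matters can be illustrated with a toy, entirely hypothetical stand-in for a correlation module: object identities are sampled independently, but each position is conditioned on the previously placed objects (here via rejection sampling that keeps objects apart), which is the kind of inter-object dependency that an independent per-object prior cannot express.

```python
import numpy as np

def sample_scene(n_objects, min_dist=1.0, rng=None):
    """Toy correlated placement: each new object position depends on the
    positions already drawn, so object positions are not independent.
    This is only a schematic illustration, not RELATE's learned module."""
    rng = rng or np.random.default_rng(0)
    positions = []
    while len(positions) < n_objects:
        p = rng.uniform(-3, 3, size=2)
        if all(np.linalg.norm(p - q) >= min_dist for q in positions):
            positions.append(p)
    return np.array(positions)

scene = sample_scene(4)
```

In RELATE this dependency is learned end-to-end from raw data rather than hand-coded, and it is what enables disentangling object positions from identities.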

    PET Reconstruction With an Anatomical MRI Prior Using Parallel Level Sets.

    The combination of positron emission tomography (PET) and magnetic resonance imaging (MRI) offers unique possibilities. In this paper we aim to exploit the high spatial resolution of MRI to enhance the reconstruction of simultaneously acquired PET data. We propose a new prior to incorporate structural side information into a maximum a posteriori reconstruction. The new prior combines the strengths of previously proposed priors for the same problem: it is very efficient in guiding the reconstruction at edges available from the side information, and it reduces locally to edge-preserving total variation in the degenerate case when no structural information is available. In addition, this prior is segmentation-free, convex, and no a priori assumptions are made on the correlation of edge directions of the PET and MRI images. We present results for a simulated brain phantom and for real data acquired by the Siemens Biograph mMR for a hardware phantom and a clinical scan. The results from simulations show that the new prior has a better trade-off between enhancing common anatomical boundaries and preserving unique features than several other priors. Moreover, it has a better mean-absolute-bias to mean-standard-deviation trade-off and yields reconstructions with superior relative ℓ2-error and structural similarity index. These findings are underpinned by the real data results from a hardware phantom and a clinical patient, confirming that the new prior is capable of promoting well-defined anatomical boundaries. This research was funded by the EPSRC grants EP/K005278/1 and EP/H046410/1 and supported by the National Institute for Health Research University College London Hospitals Biomedical Research Centre. M.J.E. was supported by an IMPACT studentship funded jointly by Siemens and the UCL Faculty of Engineering Sciences. K.T. and D.A. are partially supported by the EPSRC grant EP/M022587/1. This is the author accepted manuscript. The final version is available from IEEE via http://dx.doi.org/10.1109/TMI.2016.254960
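One common way to write such a structural prior is the parallel-level-set (directional total variation) form sketched below; the exact functional and constants in the paper may differ, so treat η and γ as illustrative smoothing parameters:

```latex
% u: PET image to reconstruct, v: MRI side information, \eta, \gamma > 0
\xi = \frac{\nabla v}{\sqrt{|\nabla v|^2 + \gamma^2}}, \qquad
R(u) = \int_\Omega \sqrt{\eta^2 + |\nabla u|^2 - \langle \xi, \nabla u \rangle^2} \,\mathrm{d}x
```

Where the MRI is flat (∇v = 0), ξ vanishes and R reduces to smoothed total variation, matching the degenerate case described in the abstract; where the MRI has a strong edge, gradients of u aligned with ξ are penalized less, guiding the PET reconstruction along shared boundaries.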

    Joint reconstruction of PET-MRI by parallel level sets

    Combined positron emission tomography (PET) and magnetic resonance imaging (MRI) scanners acquire simultaneously functional PET and anatomical or functional MRI data. As the data of both modalities are likely to show similar structures, we aim to exploit this by joint reconstruction of PET and MRI. In a Bayesian formulation, this can be achieved by adding prior information encoding that the images of the two modalities are not independent. Structural similarity can be modeled by the alignment of the image gradients or, equivalently, their level sets being parallel. Therefore we can combine the objective functions of both modalities and penalize image pairs which do not have parallel level sets. Our results show that combining the reconstruction from heavily under-sampled MRI and noisy PET data can lead to fewer under-sampling artifacts in MRI images and better-defined PET images.
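The joint formulation can be sketched as follows; the notation (forward operators A and B, data fidelities D, coupling weight λ) is illustrative rather than the paper's exact objective:

```latex
% u: PET image, v: MRI image, A, B: forward operators, \lambda > 0
\min_{u, v} \; D_{\mathrm{PET}}(A u) + D_{\mathrm{MRI}}(B v)
  + \lambda \int_\Omega \big( |\nabla u| \, |\nabla v| - |\langle \nabla u, \nabla v \rangle| \big) \,\mathrm{d}x
```

By the Cauchy-Schwarz inequality the coupling term is non-negative and vanishes exactly where the two gradients (and hence the level sets) are parallel, so minimizing it pulls the two reconstructions toward shared structure without forcing either image to copy the other's intensities.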

    NiftyPET: a High-throughput Software Platform for High Quantitative Accuracy and Precision PET Imaging and Analysis

    We present a standalone, scalable and high-throughput software platform for PET image reconstruction and analysis. We focus on high fidelity modelling of the acquisition processes to provide high accuracy and precision quantitative imaging, especially for large axial field of view scanners. All the core routines are implemented using parallel computing available from within the Python package NiftyPET, enabling easy access, manipulation and visualisation of data at any processing stage. The pipeline of the platform starts from MR and raw PET input data and is divided into the following processing stages: (1) list-mode data processing; (2) accurate attenuation coefficient map generation; (3) detector normalisation; (4) exact forward and back projection between sinogram and image space; (5) estimation of reduced-variance random events; (6) high accuracy fully 3D estimation of scatter events; (7) voxel-based partial volume correction; (8) region- and voxel-level image analysis. We demonstrate the advantages of this platform using an amyloid brain scan where all the processing is executed from a single and uniform computational environment in Python. The high accuracy acquisition modelling is achieved through span-1 (no axial compression) ray tracing for true, random and scatter events. Furthermore, the platform offers uncertainty estimation of any image derived statistic to facilitate robust tracking of subtle physiological changes in longitudinal studies. The platform also supports the development of new reconstruction and analysis algorithms through restricting the axial field of view to any set of rings covering a region of interest and thus performing fully 3D reconstruction and corrections using real data significantly faster.
All the software is available as open source with the accompanying wiki-page and test data. Support for this work was received from the MRC Dementias Platform UK (MR/N025792/1), the MRC (MR/J01107X/1, CSUB19166), the EPSRC (EP/H046410/1, EP/J020990/1, EP/K005278, EP/M022587/1), AMYPAD (European Commission project ID: ID115952, H2020-EU.3.1.7. - Innovative Medicines Initiative 2), the EU-FP7 project VPH-DARE@IT (FP7-ICT-2011-9-601055), the NIHR Biomedical Research Unit (Dementia) at UCL and the National Institute for Health Research University College London Hospitals Biomedical Research Centre (NIHR BRC UCLH/UCL High Impact Initiative - BW.mn.BRC10269), the NIHR Queen Square Dementia BRU, the Wolfson Foundation, ARUK (ARUK Network 2012-6-ICE; ARUK-PG2014-1946), the European Commission (H2020-PHC-2014-2015-666992), and the Dementia Research Centre as an ARUK coordinating centre. M. J. Ehrhardt acknowledges support by the Leverhulme Trust project 'Breaking the non-convexity barrier', EPSRC grant 'EP/M00483X/1', EPSRC centre 'EP/N014588/1', the Cantab Capital Institute for the Mathematics of Information, and from CHiPS (Horizon 2020 RISE project grant). This publication solely reflects the author's view, and neither IMI nor the European Union and EFPIA are responsible for any use that may be made of the information contained herein.
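The uncertainty estimation of image-derived statistics mentioned above can be illustrated with a generic bootstrap sketch. This is not NiftyPET code: NiftyPET resamples within its own reconstruction pipeline, whereas this hypothetical stand-in resamples toy per-event values with replacement and reports the spread of the recomputed statistic.

```python
import numpy as np

def bootstrap_uncertainty(events, statistic, n_boot=200, rng=None):
    """Bootstrap a statistic derived from list-mode-style event data:
    resample events with replacement, recompute the statistic per replicate,
    and return its mean and standard deviation across replicates."""
    rng = rng or np.random.default_rng(0)
    n = len(events)
    reps = [statistic(events[rng.integers(0, n, size=n)])
            for _ in range(n_boot)]
    return np.mean(reps), np.std(reps)

# toy "ROI uptake" values, one per detected event (distribution is illustrative)
events = np.random.default_rng(1).gamma(shape=2.0, scale=1.0, size=5000)
mean, sd = bootstrap_uncertainty(events, np.mean)
```

The returned standard deviation is the kind of per-statistic uncertainty that makes subtle longitudinal changes distinguishable from reconstruction noise.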